Action-Conditioned Contrastive Policy Pretraining
Deep visuomotor policy learning achieves promising results in control tasks
such as robotic manipulation and autonomous driving, where the action is
generated from the visual input by the neural policy. However, it requires a
huge number of online interactions with the training environment, which limits
its real-world application. Compared to the popular unsupervised feature
learning for visual recognition, feature pretraining for visuomotor control
tasks is much less explored. In this work, we aim to pretrain policy
representations for driving tasks using hours-long uncurated YouTube videos. A
new contrastive policy pretraining method is developed to learn
action-conditioned features from video frames with action pseudo labels.
Experiments show that the resulting action-conditioned features bring
substantial improvements to the downstream reinforcement learning and imitation
learning tasks, outperforming the weights pretrained from previous unsupervised
learning methods. Code and models will be made publicly available.
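As a rough illustration of what an action-conditioned contrastive objective could look like (a toy sketch with invented pseudo-labels, not the paper's actual loss): frames sharing an action pseudo-label serve as positives in an InfoNCE-style objective, while differently labelled frames serve as negatives.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def action_conditioned_nce(features, pseudo_actions, i, temperature=0.1):
    """InfoNCE where the positive set for frame i is every other frame
    carrying the same action pseudo-label (e.g. 'left'), and all
    differently-labelled frames are negatives."""
    pos_idx = [j for j in range(len(features))
               if j != i and pseudo_actions[j] == pseudo_actions[i]]
    neg_idx = [j for j in range(len(features))
               if pseudo_actions[j] != pseudo_actions[i]]
    score = lambda j: np.exp(cosine(features[i], features[j]) / temperature)
    pos = sum(score(j) for j in pos_idx)
    neg = sum(score(j) for j in neg_idx)
    return -np.log(pos / (pos + neg))

# toy batch: 6 random "frame features" with hypothetical action pseudo-labels
feats = rng.normal(size=(6, 8))
labels = ["left", "left", "right", "right", "straight", "left"]
loss = action_conditioned_nce(feats, labels, i=0)
```

Minimizing such a loss pulls together features of frames driven by the same pseudo-action, which is one plausible reading of "action-conditioned features".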
Optimization-Based Motion Planning for Autonomous Agricultural Vehicles Turning in Constrained Headlands
Headland maneuvering is a crucial aspect of unmanned field operations for
autonomous agricultural vehicles (AAVs). While motion planning for headland
turning in open fields has been extensively studied and integrated into
commercial auto-guidance systems, the existing methods primarily address
scenarios with ample headland space and thus may not work in more constrained
headland geometries. Commercial orchards often contain narrow and irregularly
shaped headlands, which may include static obstacles, rendering the task of
planning a smooth and collision-free turning trajectory difficult. To address
this challenge, we propose an optimization-based motion planning algorithm for
headland turning under geometrical constraints imposed by field geometry and
obstacles.
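For intuition, a heavily simplified 2-D sketch of optimization under a clearance constraint (an elastic-band-style toy, not the paper's algorithm; all geometry is hypothetical): alternate a smoothing step on the waypoints with a projection step that keeps them outside an obstacle's clearance disc.

```python
import numpy as np

# Hypothetical toy: a straight initial guess through an obstacle at (2, 0)
# is deformed into a smooth, collision-free turn with 0.8 m clearance.
obstacle, clearance = np.array([2.0, 0.0]), 0.8
pts = np.linspace([0.0, 0.0], [4.0, 0.0], 9)  # fixed start/goal, 7 free points

for _ in range(200):
    # smoothing step: move each interior waypoint toward its neighbors
    pts[1:-1] += 0.3 * (pts[:-2] + pts[2:] - 2 * pts[1:-1])
    # projection step: push any waypoint inside the clearance disc back out
    d = pts - obstacle
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    inside = (dist < clearance).ravel()
    # a waypoint exactly on the obstacle gets a fixed escape direction
    dirs = np.where(dist > 1e-9, d / np.maximum(dist, 1e-9), [[0.0, 1.0]])
    pts[inside] = obstacle + clearance * dirs[inside]
```

A real headland planner must additionally respect the vehicle's kinematics (minimum turning radius, implement geometry), which this toy ignores.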
CAT: Closed-loop Adversarial Training for Safe End-to-End Driving
Driving safety is a top priority for autonomous vehicles. Orthogonal to prior
work handling accident-prone traffic events by algorithm designs at the policy
level, we investigate a Closed-loop Adversarial Training (CAT) framework for
safe end-to-end driving in this paper through the lens of environment
augmentation. CAT aims to continuously improve the safety of driving agents by
training the agent on safety-critical scenarios that are dynamically generated
over time. A novel resampling technique is developed to turn log-replay
real-world driving scenarios into safety-critical ones via probabilistic
factorization, where the adversarial traffic generation is modeled as the
multiplication of standard motion prediction sub-problems. Consequently, CAT
can launch more efficient physical attacks compared to existing safety-critical
scenario generation methods and incurs significantly lower computational cost
in the iterative learning pipeline. We incorporate CAT into the MetaDrive
simulator and validate our approach on hundreds of driving scenarios imported
from real-world driving datasets. Experimental results demonstrate that CAT can
effectively generate adversarial scenarios that challenge the agent being trained.
After training, the agent can achieve superior driving safety in both
log-replay and safety-critical traffic scenarios on the held-out test set. Code
and data are available at https://metadriverse.github.io/cat. Comment: 7th Conference on Robot Learning (CoRL 2023).
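The factorized resampling idea can be caricatured as importance resampling: draw candidate adversary trajectories from a motion-prediction prior, then re-weight each by its proximity to the ego plan so that sampled scenarios concentrate on near-collision behavior. The "predictor" below is a stand-in noisy-line model, not the paper's; every number is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# ego plan: drive straight along the x-axis (hypothetical)
ego = np.stack([np.linspace(0, 10, 20), np.zeros(20)], axis=1)

def predict_candidates(k=50):
    """Stand-in 'motion predictor': parallel adversary trajectories with
    random lateral offsets, playing the role of a prediction prior."""
    offsets = rng.normal(3.0, 1.5, size=k)
    trajs = np.stack([np.stack([np.linspace(0, 10, 20),
                                np.full(20, off)], axis=1)
                      for off in offsets])
    return trajs

cands = predict_candidates()
# smallest gap between each candidate and the ego plan at matched timesteps
min_gap = np.min(np.linalg.norm(cands - ego, axis=2), axis=1)
weights = np.exp(-min_gap)          # near-miss candidates get up-weighted
weights /= weights.sum()
picked = rng.choice(len(cands), size=10, p=weights)
```

Resampling with these weights biases the generated scenarios toward safety-critical ones while staying on the support of the prediction prior.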
Guarded Policy Optimization with Imperfect Online Demonstrations
The Teacher-Student Framework (TSF) is a reinforcement learning setting where
a teacher agent guards the training of a student agent by intervening and
providing online demonstrations. Assuming the teacher is optimal, it intervenes
in the student's learning process with perfect timing and capability, providing
safety guarantees and exploration guidance.
Nevertheless, in many real-world settings it is expensive or even impossible to
obtain a well-performing teacher policy. In this work, we relax the assumption
of a well-performing teacher and develop a new method that can incorporate
arbitrary teacher policies with modest or inferior performance. We instantiate
an Off-Policy Reinforcement Learning algorithm, termed Teacher-Student Shared
Control (TS2C), which incorporates teacher intervention based on
trajectory-based value estimation. Theoretical analysis validates that the
proposed TS2C algorithm attains efficient exploration and a substantial safety
guarantee without being affected by the teacher's own performance. Experiments
on various continuous control tasks show that our method can exploit teacher
policies at different performance levels while maintaining a low training cost.
Moreover, the student policy surpasses the imperfect teacher policy in terms of
higher accumulated reward in held-out testing environments. Code is available
at https://metadriverse.github.io/TS2C. Comment: Accepted at ICLR 2023 (top 25%).
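A minimal sketch of value-based intervention in the spirit of TS2C (scalar toy problem; the margin, value model, and policies are all invented for illustration): the teacher takes over only when its value estimate for the current state exceeds the student's by a margin, rather than whenever the two actions merely differ.

```python
def shared_control_action(state, student, teacher, value_fn, margin=0.5):
    """Return the executed action and whether the teacher intervened,
    gated by a value comparison instead of action disagreement."""
    a_student = student(state)
    a_teacher = teacher(state)
    if value_fn(state, a_teacher) - value_fn(state, a_student) > margin:
        return a_teacher, True    # intervene: teacher's action is much better
    return a_student, False       # let the student explore

# toy check: the best action equals the state under a quadratic value model
value = lambda s, a: -(a - s) ** 2
student = lambda s: s + 2.0       # poor student early in training
teacher = lambda s: s + 0.1       # imperfect but clearly better teacher
act, intervened = shared_control_action(1.0, student, teacher, value)
```

Gating on values rather than actions is what lets an imperfect teacher still help: when the student's action is nearly as good, no intervention occurs and exploration continues.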
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning
Driving safely requires multiple capabilities from human and intelligent
agents, such as the generalizability to unseen environments, the safety
awareness of the surrounding traffic, and the decision-making in complex
multi-agent settings. Despite the great success of Reinforcement Learning (RL),
most of the RL research works investigate each capability separately due to the
lack of integrated environments. In this work, we develop a new driving
simulation platform called MetaDrive to support the research of generalizable
reinforcement learning algorithms for machine autonomy. MetaDrive is highly
compositional: it can generate an infinite number of diverse driving scenarios
through both procedural generation and real-data import.
Based on MetaDrive, we construct a variety of RL tasks and baselines in both
single-agent and multi-agent settings, including benchmarking generalizability
across unseen scenes, safe exploration, and learning multi-agent traffic. The
generalization experiments conducted on both procedurally generated scenarios
and real-world scenarios show that increasing the diversity and size of the
training set improves the generalizability of the RL agents.
We further evaluate various safe reinforcement learning and multi-agent
reinforcement learning algorithms in MetaDrive environments and provide the
benchmarks. Source code, documentation, and demo video are available at
https://metadriverse.github.io/metadrive. More research projects based on the
MetaDrive simulator are listed at https://metadriverse.github.io.
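The compositionality claim can be illustrated with a toy generator: if each map is a sequence of road blocks drawn from a fixed vocabulary, then n blocks over k block types already yield k**n distinct layouts. The block names below are invented for the sketch, not MetaDrive's actual block codes.

```python
import random

# hypothetical block vocabulary; a real simulator would attach geometry,
# lane counts, and connectivity rules to each block type
BLOCKS = ["straight", "curve", "ramp", "roundabout", "intersection"]

def sample_map(n_blocks, seed):
    """Compose one map as a reproducible, seeded sequence of blocks."""
    rng = random.Random(seed)
    return [rng.choice(BLOCKS) for _ in range(n_blocks)]

# a training set of 100 procedurally generated maps, one per seed
train_maps = [sample_map(5, seed) for seed in range(100)]
```

Seeding each map independently is what makes "benchmarking generalizability across unseen scenes" possible: held-out seeds give scenes the agent never trained on.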
ScenarioNet: Open-Source Platform for Large-Scale Traffic Scenario Simulation and Modeling
Large-scale driving datasets such as Waymo Open Dataset and nuScenes
substantially accelerate autonomous driving research, especially for perception
tasks such as 3D detection and trajectory forecasting. Since the driving logs
in these datasets contain HD maps and detailed object annotations which
accurately reflect the real-world complexity of traffic behaviors, we can
harvest a massive number of complex traffic scenarios and recreate their
digital twins in simulation. Compared to the hand-crafted scenarios often used
in existing simulators, data-driven scenarios collected from the real world can
facilitate many research opportunities in machine learning and autonomous
driving. In this work, we present ScenarioNet, an open-source platform for
large-scale traffic scenario modeling and simulation. ScenarioNet defines a
unified scenario description format and collects a large-scale repository of
real-world traffic scenarios from heterogeneous data in various driving
datasets, including Waymo, nuScenes, Lyft L5, and nuPlan. These
scenarios can be further replayed and interacted with in multiple views, from
a bird's-eye-view layout to realistic 3D rendering in the MetaDrive simulator. This
provides a benchmark for evaluating the safety of autonomous driving stacks in
simulation before their real-world deployment. We further demonstrate the
strengths of ScenarioNet on large-scale scenario generation, imitation
learning, and reinforcement learning in both single-agent and multi-agent
settings. Code, demo videos, and website are available at
https://metadriverse.github.io/scenarionet
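To illustrate what a unified scenario description buys (field names here are invented for the sketch, not ScenarioNet's actual schema): records from heterogeneous datasets are normalized into one dict shape, so replay, RL, and imitation-learning tools only ever consume a single format.

```python
def to_unified(record, source):
    """Normalize a raw per-dataset record into one hypothetical schema
    with a 'tracks' dict (object id -> trajectory) and an optional map."""
    if source == "waymo-like":       # stand-in raw layout A
        tracks = {tid: t["xy"] for tid, t in record["objects"].items()}
    elif source == "nuscenes-like":  # stand-in raw layout B
        tracks = {a["id"]: a["path"] for a in record["agents"]}
    else:
        raise ValueError(f"unknown source: {source}")
    return {"source": source, "tracks": tracks, "map": record.get("map")}

# two toy records in different raw layouts, converted to the same schema
w = {"objects": {"veh0": {"xy": [(0, 0), (1, 0)]}}, "map": "hd_map_a"}
n = {"agents": [{"id": "veh1", "path": [(5, 5), (5, 6)]}]}
unified = [to_unified(w, "waymo-like"), to_unified(n, "nuscenes-like")]
```

Once everything shares one schema, a scenario harvested from any dataset can be replayed or perturbed by the same downstream code path.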
- …